This page is concerned with the programs, programming difficulties, and aberrant results in exploration of the congruence set behavior in the abstract predecessor tree. Motivation and discussion appear elsewhere.
These first Maple Programs do the heavy lifting.
The recursive program treegrow (used in the initial a priori integer-by-integer development of the predecessor tree) provides a nearly complete basis for the key routine (csetcurs) in the projected csetidea program (named after "congruence set idea"). All that is additionally required to construct the predecessor tree using congruence sets rather than indiviual integers is a subroutine (called densout) to make the upward traversal in the entire path from leaf node to the root while recording key data about each element's contents .
csetcurs := proc(path::string, cvalue, dvalue, levelp2, mxl)
global lda,depth,c,mylist,dlevel,outfile;
# fprintf(outfile,"Rsetcurs called w. %6s, %8d, %8d, %3d, %3d\n",
# path,cvalue,dvalue,levelp2,mxl);
#levelp2 is a parameter to rsetcurs and keeps the local level across deeper recursions
if mxl < nops(mylist) or 100000 < cvalue then
RETURN(NULL) fi;
mylist:=[op(mylist),[path,levelp2,cvalue,dvalue,evalf(1/dvalue)]];
# fprintf(outfile, `I%19s %20s %8d %9d %9d %9d %13e \n`,
# " ",path,cvalue,dvalue,cvalue,dvalue,evalf(1/dvalue));
#look at children of cvalue, itself
#it might be a leaf
if cvalue mod 3 = 0 then densout(path)
#it might be a b-step parent
elif cvalue mod 3 = 1 then rsetcurs(cat(path,"b"), 4/3*cvalue - 1/3, 4*dvalue, levelp2 + 2, mxl)
#it must be an s-step parent
else rsetcurs(cat(path,"s"), 2/3*cvalue - 1/3, 2*dvalue, levelp2 + 1, mxl)
fi;
#look at children of cvalue + dvalue (same possibilities for children)
if (cvalue + dvalue) mod 3 = 0 then densout(path)
elif (cvalue + dvalue) mod 3 = 1 then rsetcurs(cat(path,"b"),
4/3*cvalue + 4/3*dvalue - 1/3, 4*dvalue, levelp2 + 2, mxl)
else rsetcurs(cat(path,"s"),
2/3*cvalue + 2/3*dvalue - 1/3, 2*dvalue, levelp2 + 1, mxl)
fi;
#look at children of cvalue + 2*dvalue (same possibilities for children)
if (cvalue + 2*dvalue) mod 3 = 0 then densout(path)
elif (cvalue + 2*dvalue) mod 3 = 1 then rsetcurs(cat(path,"b"),
4/3*cvalue + 8/3*dvalue - 1/3, 4*dvalue, levelp2 + 2, mxl)
else rsetcurs(cat(path,"s"),
2/3*cvalue + 4/3*dvalue - 1/3, 2*dvalue, levelp2 + 1, mxl)
fi;
mylist := subsop(nops(mylist)=NULL,mylist);
RETURN(NULL)
end;
The program csetidea contains subroutines csetcurs and densout. Csetcurs contains an internal constant limiting the the magnitude of the c value which the program will penetrate to, and the invocation of csetcurs accepts a parameter which limits the depth of penetration through the recursive levels which will be performed.
densout := proc(path::string)
local n, p2, p3, i, step, listel, llevel, lcvalue, ldvalue;
global mylist,lda,depth,c,d,densitysum,outfile;
#data from main routine at depth reached for use here
#need local variables starting at passed in value
llevel := nops(mylist);
lcvalue := mylist[llevel,c];
ldvalue := mylist[llevel,d];
#the data structure will be a list of lists
#the inner list records the accumulated values for a particular higher level congruence set
#the outer list behaves as a stack to track the position in the abstract tree traversal
#to facilitate the addressing into the inner level of the list, 5 constants will be useful
#these are coded in csetidea procedure
#traverse the left descent assembly fragment upwards, i.e. from leaf to header,
#keeping lcvalue and ldvalue current for the llevel risen to
#llevel is local to densout and tracks upward path in mylist
fprintf(outfile,`%20s %8d %9d %20s %9d %9d %13e %13e\n`,
mylist[llevel,lda],mylist[llevel,c],mylist[llevel,d],mylist[nops(mylist),lda],
lcvalue,ldvalue,evalf(2/ldvalue),mylist[llevel,densitysum]);
for llevel from nops(mylist) -1 to 1 by -1 do;
step := substring(path,llevel+1..llevel+1);
if step="b" then lcvalue := (lcvalue*3+1)/4; ldvalue := 3/4*ldvalue
elif step="s" then lcvalue := (lcvalue*3+1)/2; ldvalue := 3/2*ldvalue
fi;
#the step's result is recorded in the left descent assembly elements produced in the main routine
listel := subsop(densitysum=mylist[llevel,densitysum]+evalf(2/ldvalue),mylist[llevel]);
mylist := subsop(llevel=listel,mylist);
fprintf(outfile,`%20s %8d %9d %20s %9d %9d %13e %13e\n`,
mylist[llevel,lda],mylist[llevel,c],mylist[llevel,d],mylist[nops(mylist),lda],
lcvalue,ldvalue,evalf(2/ldvalue),mylist[llevel,densitysum]);
od;
end;
The csetidea program sets up the environment and makes the initial call to csetcurs.
csetidea := proc(mxl) global lda,depth,c,d,densitysum,mylist,outfile;
outfile:=fopen(`\\maplev4\\mywork\\collatz\\csetidea.lst`, APPEND);
lda:=1; depth:=2; c:=3; d:=4; densitysum:=5; mylist := [];
fprintf(outfile,`%20s %8s %11s %22s %7s %8s %8s %11s \n`,
"modified lda","cdown","ddown","lda at leaf","cup","dup","adds in","making");
csetcurs("e", 5, 8, 1, mxl);
fclose(`\\maplev4\\mywork\\collatz\\csetidea.lst`)
end;
And the whole program is begun with a simple invocation, where the parameter defines the maximum number of levels of penetration into the abstract predecessor tree.
csetidea(7);
In csetidea, there must be an independent data structure because the whole stack of tree nodes traversed prior to an encounter of the leaf node must have their density summation maintained throughout. Each leaf node encountered implies an initial density of integers in its congruence set plus an appropriate increment using the densities in all the superior nodes in the depth first search. Indexing into this stack beginning at the recursion level reached in csetcurs uses the local variable llevel in procedure densout.
A simple recursive procedure was used in csetidea, thus tracking its depth of progress into the tree structure. A parameter level passed to densout was used to keep it tracking the depth in the data structure. Returning from a level of recursion in csetcurs returns control to the level above in the abstract tree, so there is no independent data structure or variable needed to coordinate the progress in csetidea with the depth in the data structure. The data structure is kept synchronized with the lateral progress in the predecessor tree by the use of left descent assembly identifiers internal to it.
When a particular recursive level in csetcurs must end, either because it reached the arbitrary limit set for the computer run or because all paths from a level have been explored, it suffices to RETURN from the recusive invocation, and all is taken care of. This works even when a series of consecutive levels in recursion must be popped (possibly the first because of exceeding the program limits and the rest due to completion of exploration of the possible paths from the starting level).
In csetidea, there must be an independent data structure because the whole stack of tree nodes traversed prior to an encounter of the leaf node must have their density summation maintained throughout. Each leaf node encountered implies an initial density of integers in its congruence set plus an appropriate increment using the densities in all the superior nodes in the depth first search. Indexing into this stack beginning at the recursion level reached in csetcurs uses the local variable llevel in procedure densout.
The data structure selected to represent the levels in the current pathway back to the root is a list of lists called mylist. At each level of mylist, the variables recorded are the left descent assembly name, the depth of the node, the c value of the congruence set, the d value of the congruence set, and the density sum of the congruence set to date. The number of lists in the list of lists is to be maintained equal to the depth of recursion achieved in csetcurs. This all seems to work mechanically, but the values accumulated in the density sum for each congruence set seem consistently to come out unreasonably large.
One key question is how (or whether) to adjust the density increments to be added into the density sum in the upward elements during development of each path in the abstract binary tree. In this program, which works at the congruence set level, each set contains only a single congruent integer, so it appears that 1/d is the appropriate density measure. On the way down, the 1/d density measure suffices since it tracks correctly the density of the odd numbers in each congruence set of the path as it is developed. On the way back up the d-values often undergo change to higher values, whence the density they contribute to the density sum is lessened. These values are used to modify the accumulated densitysum at llevel as densout does its upwards traversal.
It is almost impossible to make sense when talking about the output from csetidea. The program is driven primarily in the predecessor order, i.e. it starts at 5[8] and proceeds to explore all congruence set predecessors in turn until it reaches some arbitrary preset limit which prevents the program from running forever. Whenever a leaf set is encountered, a new left descent assembly has been found, and further exploration of the predecessor tree is postponed while the properties of the left descent assembly are examined, partially recorded, and output for inspection. The output proceeds from the leaf node back to the root node, i.e. in the successor direction, but the output is arranged in columns, some of which appear to be in predecessor order and others of which appear to be in successor order. Since the accustomed presentation of a tree in computer science is with the root at the top with every node's children hanging below it, this view seems most natural to me. But the tree is a predecessor tree from the viewpoint of the Collatz trajectory, so the tracing of a Collatz trajectory starts internally within the tree and follows upwards along the paths to the root.
But it is worse. The stack used to record the key information about each element of each left descent assembly visited would customarily be built from the bottom upwards, which would result in the direction of the traced paths in the stack being in the predecessor diection, exactly contrary to the direction of tracing those same paths in the predecessor tree where an upward trace is in the successor direction. Of course, the orientation of the tree and of the stack are merely common conventions which happen to conflict in this instance. And still worse, the output from csetidea needs to use both conventions in order that the congruence sets as they exist in both the stack and the paths of the tree are displayed.
Here is a very small sample of the output from csetidea. A great deal is going on here, so some explanatory notes are given below. The unexpected numeric results presumably (at least partially) arise from instances like those in the following, which are the first 12 lines from csetidea.lst. The output has been embellished with "!" and "*" characters to point out two(?) problem(s) which this approach encountered. Perhaps other similar but distinct features contribute similarly.
modified lda cdown ddown lda at leaf cup dup adds in making
es 3 16 es 3 16 6.250000e-02 6.250000e-02
e 5 8 ! es 5 24 4.166667e-02 1.666667e-01
esbb 33 256 * esbb 33 256 3.906250e-03 3.906250e-03
esb 25 64 ** esbb 25 192 5.208333e-03 2.083333e-02
es 3 16 *** esbb 19 144 6.944444e-03 6.944444e-02
e 5 8 ! esbb 29 216 4.629630e-03 1.712963e-01
esbbbb 513 4096 esbbbb 513 4096 2.441406e-04 2.441406e-04
esbbb 385 1024 esbbbb 385 3072 3.255208e-04 1.302083e-03
esbb 33 256 * esbbbb 289 2304 4.340278e-04 4.340278e-03
esb 25 64 ** esbbbb 217 1728 5.787037e-04 2.141204e-02
es 3 16 *** esbbbb 163 1296 7.716049e-04 7.021605e-02
e 5 8 ! esbbbb 245 1944 5.144033e-04 1.718107e-01
Note in passing how the program works for the left descent assembly "esbbbb" (last six lines). The path from "e" down (in the abstract tree) to "esbbbb" was created and its elements stacked. Then at the leaf node the path is traversed (upward in the abstract tree) with calculation of the c[d] for each congruence set. The density sums are calculated and the congruence sets' descriptive information for each visit are printed in parallel on a single line. The accuracy of the c[d] calculations and the 1/d calculations may easily be hand-checked in this output layout. The csetidea output also provides opportunity to check that the congruence sets attributed to each level in the upward traversal (cup and dup) and therefore to be properly assigned as contributers to the integer density in that node of the abstract tree in the downward traversal (cdown and ddown) are indeed subsets of the congruence sets which occupy that level. . For this purpose the c-value for the congruence sets during both the upward- and downward- developments is maintained and output as part of the work of densout. Checking the validity of the output against this requirement eliminates at least this potential source of error.
In this output read the cup column upwards for the series of Collatz predecessors of 5 in the "downward" path (in the predecessor tree) while for the "upward" (Collatz successor path in the predecessor tree) read downward in the cdown column.
We see (looking at the sequence after "esbbbb") 5, 3, 25, 33, 385, 513, are the predecessors of 5, while 513, 385, 289, 217, 163, 245 are the successors of 513. The early elements in the "upward" path are identical with the elements in the "downward" path at the same level. Thus it is not until 289[768] is reached that fresh increments to the density sum are observed.
The lines marked with "*" are ones which indicate some duplicative assignment of density into some congruence sets which are visited in the upward paths from certain (perhaps all?) discovered left descent assemblies. When "esbb" is first visited at its discovery as a leaf node, 33[256] is assigned a density of 1/256. When it is again visited in the upward path from "esbbbb" its density is augmented by 1/2304 which means that 1/9 of its integers are recounted. The same pattern of duplication is noted for 25[64] and 3[16] and countlessly more congruence sets (not shown in this sample).
Also where cdown[ddown] differs from cup[dup] (marked with "!", as in the second 'es' line and the third 'e' line), the latter is a subset of the former. Thus, a duplicate accounting is committed whenever such situations arise.
In spite of the shortcomings of this program with regard to accurately accumulating the density sums of the left descent assemblies, it is useful in that it provides in detail the inital instances of the congruence sets which constitute the abstract tree. The output gives the impression that those subsets to be added in as "fill-ins" are amply (even excessively) provided. The copious output from csetprog, replete with references to the numerous subsets of a parental congruence set, provides a feeling of correctness of the mechanism by which the congruence sets are completely filled.
Attempts to minimize over-accounting of the density sums contributing to each congruence set visited within the csetidea program did not succeed.
To overcome the problem of over-estimation, a rather elaborate post-processing procedure was developed. First a SPITBOL program ran against a copy of csetidea.lst which had been sorted on the c[d] values (i.e. cdown, ddown, cup, and dup) to eliminate second and subsequent records sharing cdown, ddown, and cup values, and the resulting smaller file run against a program to sum the individual density contributions of the now unique congruence sets to determine the total densities contributed both in the downward and the upward direction of the abstract tree development. The result was an under-estimate of the densities expected for the densities of integers in the congruence sets. It would seem that an intermediate course is required in which the instances of multiple visits to a given congruence set must be allocated a carefully calculated density increment. I have not pursued this approach.
Now, as to the challenge this project represents to the computer. To accurately accumulate the "fill-in" densities for each node of the abstract predecessor tree, all the nodes dependent from each one would have to be visited and their contributions added in. This is obviously impractical because the tree is infinitely deep. One must settle for the approximation which results if the tree growth is limited by the magnitude of numbers reached or by the depth of the deepest level reached (or both). Of course, the width of the abstract tree increases as the the allowed depth increases, so the time of the calculation goes up as the square of the depth penetrated. If a deep penetration is required, extra precision will be required to account accurately for the tiny density contributions of the deeply situated nodes. This, too, would increase the time dependence of the program.
Potentially, one might divide up the traversal of the abstract tree and divvy out the separated chunks for calculation by members of a team contributing to the overall project. I would hesitate to undertake any such task myself, because the absolute precision required by the lowest congruence sets at the bottom of the tree will always loom just out of reach, and the amount of information to be transmitted from each chunk of the calculation would be truly overwhelming.
The skeptic might suggest that during the determination of the densitysum in each congruence set, some loss of congruence sets (or their members) could occur. However, the integers identified in the leaf set at the bottom of the downward stage of abstract binary tree development are each one individually destined to follow their defined paths back up through the successive congruence sets identified in the full tree development. Neither integers nor congruence sets are lost or inserted into any path created. Likewise, the steps used in the downward process (predecessor steps) are in the precise inverse order to that used in the upward process (successor steps) so density changes can be due only to development of sparser and sparser congruence sets. So, the discrepencies in the densitysum are to be ascribed to incorrect accounting, not to loss or addition of congruence sets or any of their elements in the abstract tree.