Producing "Designer" Path Segments

Why Bother?

We know that the Collatz predecessor tree is an infinite tree. We know that the abstract predecessor tree contains an infinite number of l.d.a.s.,  and that each l.d.a. has an infinite number of instantiations.  Feinstein has offered a proof that the infinite nature of these structures implies that the Collatz conjecture is unprovable.  I argue elsewhere in these pages that the proper metric for infinity in discussion of the Collatz conjecture is density,  i.e.  the number of integers appearing in some set within its defined range divided by the capacity of that range were it completely filled.  The summation of the density contributions of all the sets representing all the elements of all the instantiations of all the l.d.a.s in the abstract tree is 1.  Since we are necessarily limited by the finite practicalities,  we can surely say that the limit of the sum of all the densities approaches 1 as we include more and more larger and larger l.d.a.s in the abstract tree. While perhaps not a proof in itself,  this is certainly a property required if the conjecture is true.

So some feeling needs to be developed about those very large l.d.a.s in the lower reaches of the abstract tree. What are they like?  What order of magnitude are we talking about?  Can we visualize the definition of an l.d.a.  (or,  by extension,  some sort of elaboration of l.d.a.s, say sequence vectors)  which provides a clear cut implication of an asymptotic approach toinfinity?  Can we get a concrete picture of such a thing?  We may not answer these questions,  but it makes a very nice playground.

The Makepath Program

By virtue of its capability of producing any/all predecessor paths in the Collatz tree,  nbigpath indicates that "custom" or "designer" paths should easily be generated.  By specifying a particular path to be developed,  the problems of dealing with the whole Collatz predecessor tree (an infinite object)  are avoided by focusing entirely on any one particular trajectory at a time. 

Makepath has an error or a cute device in it  (depending on your viewpoint)  which lets it build a path from some root integer through a series of  (nominal, but usually strictly incorrect)  Collatz predecessor calculations to follow   (or create)  a defined sequence of Collatz steps.

On the one hand,  this is an impossible task.  A path which follows a prescribed arbitrary sequence of steps started at an arbitrary integer high in the predecessor tree and working outwards through predecessors cannot succeed.  Evidence for this is seen in the sample of output from nbigpath where it is quickly evident that only every other digit appears in the available branches from any branch point.  When a path reaches a leaf node,  there can be no further predecessors,  and for non-leaf nodes,  the next step is fully specified by its value mod 3,  resulting in the existence of only one,  not two,  next steps.  Thus,  only half   (or 1/2^nth)  of paths can exist.

On the other hand,  makepath performs something very like that impossible trick.  It produces a large integer which,  used as departure value for a Collatz trajectory does produce the specified sequence of steps.  The value it arrives at after the requested trajectory is still a large integer which will often take a large number of Collatz steps to reach 1.  The error   (or cute device)  derives from the fact that makepath takes no pains to suppress a further step when a leaf node is encountered.  Makepath is also quite indifferent about the value of oldcoeff which is employed in the calculation of the next stepsize,  in no way insisting that oldcoeff has a value appropriate to previous history or to the next step.  The program does enter the code fragment appropriate to the requested step   (reduced to either 1  (s-step)  or 2   (b-step)  at the key point in the program),  but it determines the size of the step to take by including either one or two times the oldcoeff to achieve the appropriate value mod 3.

The initial values of d and c of the dn+c notation for the root of the subtree from which a path is to be found are accepted as the first two parameters.  The makepath program also accepts a pattern given as a sequence vector and the number of repetitions of that pattern for it to trace out.  The program proceeds with development of the consequent formulas (i.e. residue sets) by the methods of the downward portion of  l.d.a. formula development in the abstract tree.   The last integer value output gives an n value whose Collatz trajectory will trace  (now in the upward direction in the predecessor tree)  the entire requested l.d.a.  path  (whose length is reps*length(pattern))  followed by whatever additional steps are required to get to the root node  (1) from there.  Characteristically,  the requested path will represent only a relatively small first portion of the path makepath created, followed by a relatively long path to the root.

> makepath := proc(dstart,cstart,pattern,reps) local
>     alphabet,coeff,oldcoeff,const,i,j,currstep;
>     alphabet:=`123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz`;
>     fopen(`pathmade.lst`, WRITE,TEXT);
>     fopen(`madeskip.lst`,APPEND,TEXT);                         #instrumented
>     coeff := dstart;
>     const := cstart;
> #   oldcoeff := coeff/3;                                       #original stmt. pos'n
>       for i from 1 to reps do;
>          for j from 1 to length(pattern) do;
>          currstep := SearchText(substring(pattern,j..j),alphabet);
>             while currstep>2 do;
>                 coeff := 4* coeff;
>                 const := 4* const;
>                 currstep := currstep - 2;
>             end do;
>          oldcoeff := coeff/3;                                  #corrected stmt. pos'n
>          sdid :=``;                                            #instrumented
>             if const mod 3 = 0 then                            #instrumented
>                fprintf(`madeskip.lst`,`%20d %20d %20d %3d %3d`,#instrumented
>                coeff,const,oldcoeff,i,j);                      #instrumented
>                sdid := `??`;                                   #instrumented
>             end if;                                            #instrumented
> #check for source of s-step first
>             if currstep=1 then
> #look at children of const (i.e. c-value), itself
>                if const mod 3 = 2 then
>                   show := `s0`;                                #instrumented
>                   coeff := 2*coeff; const := 2/3*const - 1/3;
> # and then its 2nd and 3rd opportunities
>                elif (const + oldcoeff) mod 3 = 2 then
>                   show := `s1`;                                #instrumented
>                   coeff := 2*coeff; const := 2/3*const + 2/3*oldcoeff - 1/3;
>                elif (const + 2*oldcoeff) mod 3 = 2 then
>                   show := `s2`;                                #instrumented
>                   coeff := 2*coeff; const := 2/3*const + 4/3*oldcoeff - 1/3;
>                end if;
> #look for source of b-step now
>             elif currstep=2 then
> # all 3 opportunities, again
>                if const mod 3 = 1 then
>                   show := `b0`;                                #instrumented
>                   coeff := 4*coeff; const := 4/3*const - 1/3;
>                elif (const + oldcoeff) mod 3 = 1 then
>                   show := `b1`;                                #instrumented
>                   coeff := 4*coeff; const := 4/3*const + 4/3*oldcoeff - 1/3;
>                elif (const + 2*oldcoeff) mod 3 = 1 then
>                   show := `b2`;                                #instrumented
>                   coeff := 4*coeff; const := 4/3*const + 8/3*oldcoeff - 1/3;
>                end if;
>             end if;
>             if sdid<>`` then                                   #instrumented
>               fprintf(`madeskip.lst`,` %2s %20d\n`,show,const);#instrumented
>             end if;                                            #instrumented
>          fprintf(`pathmade.lst`,`at %2d,%2d %s %80d \n`,i,j,
>               convert(ifactor(coeff),string),const);
>          end do;
>       end do;
>    fclose(`pathmade.lst`);
>    fclose(`madeskip.lst`);                                     #instrumented
>    end proc;

The above rendition of the source code for makepath was consciously chosen to make it evident that the program reached its current state after some false steps.  Serious readers wishing to reproduce results mentioned here,  or to pursue these threads to greater lengths,  may need to explore a bit to encounter the exact program variation which exhibits a particular behavior.

The use of  (24,5)  as the first two arguments to makepath seems generally useful.  This parameter pair effectively starts at 5 with 8 as coeff due to the early oldcoeff := coeff/3; statement.  (The use of  (8,5) results in abnormal termination because 8/3 is not an integer.)  One special case uses  (3,1)  for the first 2 arguments.  This (d,c)  pair serves when the requested path is one of the output lines from nbigpath which produces paths rooted at 1. 

I'll make a brief diversion here to illustrate both the proper behavior from the root and the impropieties rapidly encountered.  Nbigpath's output includes a line giving the integer 153547 from `43211422142132242124421`.

Running npathtrk with n=153547 results in `43211422142132242124421`.

Running makepath(3,1,`43211422142132242124421`,1)  gives the correct integer sequence ending with 153547. Thus,  running npathtrk against that integer does indeed reproduce the path to it discovered by nbigpath,  validating both programs in a simple case. But the coefficients d produced are extremely large,  but harmlessly so,  in that first and only rep.  No output to madeskip.lst occurred in this run.

However,  asking for more reps gives extremely bloated integers in all larger reps.  Notably,  the instrumentation with output to madeskip.lst produced a large file indicating frequent encounters of leaf nodes,  including all four types of leaf nodes, in the second and subsequent reps.  Nevertheless,  the pattern of the path is faithfully reproduced.  So even in a case where rational behavior might be most expected,  unexpected things arise.  The following is the path reported by npathtrk after editing to divide the path string into chunks the size of the pattern.

    root at 1<---  2153
43211422142132242124421
43211422142132242124421
43211422142132242124421
43211422142132242124421
43211422142132242124421 <-- start at large intgeger produced by makepath

Despite the small number of those few rational guidelines for choice of its first 2 parameters,  the program is remarkably flexible,  operating successfully for current purposes from many  (24*2^i,5)  or even  (24*2^i,m) (m large odd integer)  values for the first two arguments.  The results of runs which might seem closely related don't always result in identical pathways.

It quickly proved necessary to have a program useful to assess the output from makepath.  Recall that even though makepath produces many lines of output, it is only the final large integer which is of interest,  as it is that integer which is to be run through the Collatz iteration to observe the details of the trajectory.  Although leaving the details of npathtrk for the next page of this site, we will refer to some aspects of its output in the following because makepath's output is really only understandable through npathtrk.

Specifically, the use of npathtrk allows rapid production of graphs representing paths of interest whose locations in the Collatz tree have been produced by makepath. 

The earliest application of makepath was to explore l.d.a.s (in which only 1 or 2 divisions by 2 are admitted).  In the following two paragraphs the s- and b- step notations are used.  Once the program was altered to handle "sequence vectors" instead of only l.d.a.s.,   the steps would be denoted as 1 or 2. Two such examples  (using the (24,5) arguments)  are recorded here,  that for a path segment consisting solely of s-steps,  and that for a path segment consisting solely of b-steps.

These plots may serve to clarify the individual steps in graphs where s- and b-steps are intermixed.  Note that the vertical lines in the s-step case extend from a line to a line immediately below it,  while in the b-stepcase,  the vertical line crosses two intervals.  This pictorially represents the origin of the s- and b-nomenclature to indicate small and big steps.   A real example (i.e. one with intermixed s- and b-steps) is also provided.

The program was quickly upgraded to allow for trajectories in which other than s- and/or b- steps occur.  This was accomplished by using an "extended hexadecimal" notation to represent the number of divisions by 2 between the successive odd integers in the trajectory. (See the program above for the detailed alphabet.)  Since the extended alphabet includes the letters "b" and "s" to represent the digits 35 and 52,  respectively,  in paths containing those numbers of consecutive divisions by 2,  it was necessary to use "1" and "2" to represent what we have previously called s- and b- steps.  It proved easy to develop uncomplicated graphs extending over a huge number of orders of magnitude using these notations,  e.g. one produced from a long series of '6s6' path segments.  This is but one of several examples where the first  (in predecessor order)  instance of the path requested is produced with an error in it.  Although many instances of this error were repaired by moving the oldcoeff := coeff/3; statement down 9 lines from its original location  (see the listing above), not all such errors were corrected.

Makepath has the remarkable capability of finding a correct path when the path it is requested to generate cannot be achieved following the Collatz predecessor rules.  It accomplishes this by making the usual increments  (in the statements #B through #C)  in the d value of the {c[d]} expression which it carries along to calculate the next predecessor from,  whether or not the const value  (in statement #A)  indicates the encounter of a leaf node.  This creation of a larger playground at each such encounter apparently allows correct selection of a suitable predecessor.  Of course,  this unwarranted increment in the d values causes the magnitude of the values in the reported path to be quite large,  often many fold larger than the smallest instance of the requested path segment occurs at.  [If the smallest possible example of a Collatz path with any given specified sequence vector is desired, the method used to determine the descent to 27 is recommended.]

The statements marked as " #instrumented" were inserted to trace the instances in which this process was applied by makepath.   This instrumentation may help resolve any further misbehavior,   some of which are thought to remain even in the current version of the code.

Several examples of much more complicated paths intended to reach deeply into larger magnitudes of numbers produced graphs of an artistic cast,  but of little use for development of insight into the Collatz conjecture.  

As to the contribution to the total density of the predecessor tree contributed by the examples,  simply bear in mind that each odd point reached contributes something a little less than 2/2^y, where y is the coordinate on the y-axis of each npathtrk plot  (because the d value of c[d] is always greater than the plotted c value).  Thus those density contributions are very small indeed.


My Collatz HomePage      Index of Terms Used