Users of browsers with incomplete coverage of mathematical symbols will want to know:
≡ represents equivalence (===)
∈ represents set membership (in)
⇐ represents <== (produced by)

Plan of Attack for a Proof of the Collatz Conjecture

Most readers of these pages,  but especially those with a mathematical bent,  find themselves overwhelmed by the volume of textual material,  and rapidly lose interest.  This page is an attempt to focus on the small fraction of this work which I believe suffices for the basis for a proof.   Even those segments presented here are limited to minimal treatment with liberal reference to the more discursive material written previously.  Hopefully,  this skeletal treatment will pique the interest of readers to look,  as needed,  at the detail which is provided through the references.  Readers hip to computer science will need to visit relatively few references,  but readers unfamiliar with computer science may need to visit lots of references.

Accordingly,  I will focus here on only four aspects of the Collatz problem:
a) develop the state transition diagram -- develop the valid Collatz trajectory and predecessor transitions,   then
b) build the abstract predecessor tree using the allowed transitions,  then
c) do the combinatorics and the densities to show that all odd integers are included in the predecessor tree,  then
d) do the inductive constructive predecessor tree build of the general predecessor tree to nail the proof.

This page describes a proof based on the state transition diagram for the allowed transitions of the Collatz trajectory.  Mensanator has objected that the state transition diagram includes behavior of only the positive integers,  insisting that the negative integers should be accomodated also.  Since the negative integers do introduce a loop in Collatz-like trajectories, he concludes that such a proof cannot be satisfactory.  I would object that since the Collatz conjecture applies only to positive integers,  it is perfectly appropriate to base a proof on the behavior of the positive integers.

Mensanator's objections did,  however,  help prompt me to develop an alternative approach based on a compete catalog of the residue sets which appear in the abstract predecessor tree  (up to program defined limits which control what would otherwise be an infiite process).


The Collatz conjecture has attracted hundreds of interested workers during the seventy years since its proposal but is still without a generally accepted proof.  The conjecture involves a recursive iteration on the positive integers and proposes that the trajectories of these iterations are such that convergence to 1 will result from any starting value. We need only prove that the Colltz graph is comprehensive and weakly connected.

It appears the conjecture could be proven easily if one could just develop a structure which accommodates the peculiar behavior of its trajectories.   This work develops and then proceeds from just such a structure.

State Transition Diagram for the Collatz Iteration

Def'n: {Ta} (i.e. a set of transition results in the abstract tree)  represents a set of trajectory elements congruent to some {c[d]}, ultimately {Ta}===c(mod d).

By regarding the Collatz iteration as a finite state machine, a directed graph called a state transition diagram is readily constructed.  It shows the result of each Collatz transition as an edge from each {Ta} to its immediate predecessor.  The values are characterized in terms of their residue sets {Ta}[3] and {Ta}[8].

Note: the predecessor direction is from a node to its Collatz predecessor,  eqn 2,  opposite to that of the Collatz trajectory,  eqn 1.

Note: The Collatz predecessor tree is considered to consist solely of odd integers ({Ta} in {1[2]}).  The even numbers will be included later  (almost as an afterthought)  as the powers-of-two multiples of the odd numbers, i.e.  {Ta in {0[2]}} produced from 2k *{Ta in {1[2]},k in N}.

With the observations gleaned from the state transition diagram it is a small step to develop an abstract predecessor tree.

Def'n: An l.d.a.  (left descent assembly, so-named because it is a left descent in the related binary predecessor tree) is a path from the root of the abstract predecessor tree (i.e. {Ta in {5[8]}) to a leaf set (i.e. {Ta in {0[3]}), inclusive of the root, internal nodes, and leaf node.  {Ta} in {1[3]} produce predecessors bigger than themselves,  and {Ta} in {2[3]} produce predecessors smaller than themselves,  whence we have formed the habit of calling the former b-steps and the latter s-steps.  In eqn 2  i in {1,2} in l.d.a.s., thus limiting the number of successive divisions by 2 in an l.d.a. portion of a Collatz trajectory.

Internal nodes in l.d.a.s {Ta which are subsets of {k[8]},k in {1,3,7}) are composed of two subsets  ({1[3]} and {2[3]})  which lead via b- and s-steps to two predecessor sets,   as controlled by the value of floor(mod(Ta,72)/24).   Only the header and internal nodes are included in the state transition diagram because the leaf nodes have no predecessors and cannot contribute to elaboration of an l.d.a.

{Ta} in {5[8]} have no incoming edges. All the other nodes accept incoming edges and can chain together to form a possibly lengthy l.d.a.  It is this separation of capabilities which justifies the selection of {5[8]} as the root of the abstract predecessor tree,  and the complementary sets {{1[8]}, {3[8]}, {7[8]}} as the internal nodes of  l.d.a.s.

A rearranged state transition diagram showing the linkage between several  l.d.a.s is offered and discussed briefly.  In that diagram the same internal nodes are shown, but leaf nodes and extensions are shown as collected by large curly braces which explicitly provide entries into deeper l.d.a.s.   The extensions, {Ta in {5[8]}}, come from eqn 2 with i>2.

To summarize,  the binary predecessor tree is composed of  l.d.a.s linked together by extensions which serve as headers for another generation of  l.d.a.s.   Every odd number has an extension,  but only those extensions in {1[3]} or {2[3]} appear as headers of additional l.d.a.s.  The extensions in both {0[3]} and {5[8]} are leaves and  l.d.a. headers.  In other words,  these are null l.d.a.s containing no internal nodes. .

The Abstract Predecessor Tree

The abstract tree is a tree whose nodes are infinite sets of odd integers {c[d],  i.e. congruent to c(mod d)}.  We often refer to (and manipulate) these sets using formulas {dn+c, and d in {2a*3b},[a,b] in Z+,n in N}, reflecting the means of their derivation (below).

In addition to the abstract tree view of the Collatz predecessor tree,  we have also pictured general and binary views of it.  The general tree is valuable because it faithfully depicts all the Collatz trajectories from deep in the tree back to the root node at 1.  The binary tree is more useful in following the dissection of the tree into l.d.a.s and extensions.  While useful for following the trajectories, these views contribute little to a proof of the Collatz conjecture.

Any set in the abstract tree other than a leaf node set can give rise to a predecessor set by either an s- or a b- step.  This immediately produces the abstract tree  (see two views there).  Every path from root to a terminating leaf set in the abstract tree represents an l.d.a.  L.d.a.s can be denoted by a string "e(s|b)kt", where the e represents the extension header,  the parenthesized expression indicates k ( in N) selections in any order of s- or b-steps,  and the t represents termination due to having reached a leaf set.

This expression denoting the detailed structure of an l.d.a. is in predecessor order.  Read left to right to follow the growth of the predecessor tree.  Read right to left to follow a path segment in the direction of the Collatz trajectory.  We will most often use the predecessor direction for l.d.a.s and other path segments.

As is evident in a brief schematic of abstract tree development all possible l.d.a.s are formed by systematic use of both s- and b-steps from the root and then each successively developed node.

We need now to develop the {c[d]} residue class for every element of every l.d.a..  For example,  the root node {5[8]} in the abstract tree includes as subsets the particular {c[d]} residue class for the root of every l.d.a., and similarly for all the other nodes in the abstract tree.  When eq2 is employed to produce each node's predecessors,  a leaf set is identified as well as two predecessor sets.  The leaf node {c[d]} value is precisely that describing the leaf set resulting from the l.d.a. traced to that point.  Now,  applying eq1 through the successor (upward) pathway,  the resulting series of {c[d]} values will represent those of the individual sets of the l.d.a.  traced,  always subsets of the values observed in the predecessor direction, and concluding with the particular subset of {5[8]} which represents the header node of the l.d.a.  traced.

The process is pictured schematically ,  and can be accomplished by hand in a down- and up again- process using the dn+c formula or by the program treegrow.  Two illustrative subsets (complete and headers only) of the formulas for l.d.a.  elements are provided.  Margaret Maxfield has provided a proof that all these residue classes are unique.   Note that there has been a subtle transformation effected by the determination of residue classes for the nodes in the l.d.a.s in the abstract tree.  The abstract tree nodes which were first described as members ( in ) of all the sets in this path to leaf nodes (there or deeper) have been converted to an identity (===) describing the residue set of that single set representing the leaf node of the l.d.a. which culminates there.

Since every element of every l.d.a. is a unique residue set {Ta===c[d]} we may turn to the task of  "counting" the integers in that infinite set (of l.d.a.s) of infinite sets (instantiations over k in N of each particular l.d.a.).

Combinatorics, the Density Metric, Infinite Summations

To begin with,  cardinality is not the correct metric to use to determine if a set of sets of odd integers contains all of them.  I suggest herein that density is the appropriate metric.

I'll reference and skip over a few infinite summations of set densities built directly on the abstract predecessor tree which indicate that the total density of the sets of odd numbers created by it is 1,  i.e.  the totality of them.  I believe them to be accurate and meaningful,  but they are based on a much less detailed analysis than that of the next paragraph,  and so might be less convincing.

We'll take advantage of a picture worth well over a thousand words.  One simple picture reveals the combinatorics which lead to the binomial pattern of the frequencies of occurrence of  l.d.a.s sharing a common (a,b) of the 2a*3b expression which is the d of c[d],  the use of an "isobar" to attach together a common power-of-2 depth into the abstract tree, and the appearance of the Fibonacci series as a result.  This leads quickly into a summation of the densities of the odd integers across the whole infinite abstract predecessor tree.  This summation  indicates that the abstract tree contains all the odd integers, with the boundaries carefully aligned .  ( Taking the limit of the uncounted odd integers as a goes to infinity confirms this result.)

The Constructive Inductive Proof

Two illustrations have been prepared to help the reader accustom himself to the nodes available for Collatz graph construction and inspect their properties.  The first presents isolated nodes without putting a context around them,  and the second shows them in the context of a schematic binary predecessor tree.  These pictures do a better job of characterizing the situation than I seem able to produce verbally.

Let's make an inductive proof by construction that the graph must be a binary tree.  Start with a two-node binary tree consisting of the root and its first extension.    It is a tree, subsuming whatever problems are associated with the unique root.  This tree has 2 unsatisfied edges.  We'll take this two-node tree as the basis for induction by repeated addition of an appropriate new node.

Now there are two cases available for building the tree.  Either an l.d.a.  element can be added as a left child or an extension element can be added as a right child. 

Case a: If an l.d.a. element is added,  a tree will result and the resulting tree will either have the same number of unsatisfied edges (if the l.d.a. element is a leaf node),  or an addition of one unsatisfied edge (if the l.d.a.  element is an internal one). 

Case b: If an extension element is added as a right child, a tree is again obtained,  with no increased number of unsatisifed edges (if the extension heads a null l.d.a.),  or the addition of one unsatisfied edge (if the extension is not a leaf). 

In either case a binary tree results,  and the resulting tree will contain at least as many unsatisfied edges as the starting one,  but often one more.  The number of unsatisfied edges monotonically increases as additional nodes are added. At any stage of this inductive tree construction, a tree is the point of departure, and a tree results from the addition of a new node.

That completes the induction.  We conclude that the Collatz predecessor graph can only be an infinite binary tree.  That the tree is binary is a stronger result than we need.  Any sort of tree is weakly connected,  and would suffice to prove the Collatz conjecture.

If that all seems too easy, consult a sort of reprise and a more comprehensive kind of theorem,  though that theorem is unproven  (who needs it now?).

My Collatz Home Page        Index to Terms Used