Trails and Graph Theory: The Series

Landing page for the entire series of posts on graph theory and trails.

Trails and Graph Theory 10: Databook

Long distance hikers, especially old-school hikers before hiking apps, have been accustomed to rely upon “databooks“, a brief text summary of the important intersections, water sources, features, and distances of a long hike. Here is a portion of a page from the Appalachian Trail 1987 databook, a slim pocket-size saddle-stapled booklet.

And this is an image from a portion of the databook for the Northern New Mexico Loop (by Blisterfree, the creator of the Grand Enchantment Trail).

Not all backpackers use a databook during their hike, even pre-app, but it is still useful for planning prior to the trip. Some hikers do find an advantage in reading a text summary on trail, rather than squinting at a topo map representation on their smartphone screen. Hike your own hike; your mileage may vary; use tools and do not let them use you.

In our previous post, we were able to find a long loop route in the Gila National Forest (that does not repeat any segment), in the form of a numeric list of nodes. Our graph imported via OSMnx from OpenStreetMap includes extra information as node attributes and edge attributes, that we can extract in a Pythony way to make our own databook of sorts. Here is an example of the text information in one edge of our graph.

{'osmid': 243511168, 'ref': '165', 'name': 'Pine Flat Trail #165', 'highway': 'path', 'oneway': False, 'reversed': [False, True], 'length': 4374.8279999999995, 'from': [2509057410, 2509594627, 2509057291, 2509594252, 2509057424, 2509594385, 2509594643, 2509594263, 2509594657, 2509594529, 2509057314, 2509594279, 2509594664, 2509594284, 2509594543, 2509594673, 2509594294, 2509057334, 2509594680, 2509594556, 2509594176, 2509594306, 2509594691, 2509594435, 2509057350, 2509594568,
<truncated>

From that information we can generate a summary from our Euler graph.

  MILE   ELEVATION  TRAIL                         INTERSECTION                      
...
  105.0  6509       Hells Hole Trail #268         West Fork Trail #151          
  105.5  6529       Trail #785                    Hells Hole Trail #268         
  106.0  6483       West Fork Trail #151          Trail #785                    
  109.7  6834       Turkey Creek Trail #155       West Fork Trail #151          
  111.8  7851       Mogollon Creek Trail #153     Turkey Creek Trail #155     
...

The Python code is fairly short (excluding the altitude part, which is not included in our OSM data). The column format is enabled by the library columnar, exceedingly easy to use.

def report_path(Q, txt='',freedom_units=False,lat_long=False):
    meters_to_mile = 0.0006213712
    meters_to_feet = 3.28084
    meters_to_km = 0.001
    
    if nx.is_eulerian(Q):
        total_length = 0
        data = []
        for u,v,k in nx.eulerian_circuit(Q, keys=True):
             data_line = []
             if freedom_units:
                 data_line.append(str(round(total_length*meters_to_mile,1)))
             else:
                 data_line.append(str(round(total_length*meters_to_km,1)))
                 
             d = Q.get_edge_data(u,v)[k]
             length=0                 
             if 'length' in d:
                 length = d['length'] #meters
                 total_length += length
             node_attr = Q.nodes(data=True)[u]
             long = node_attr['x']
             lat = node_attr['y']
             if lat_long:
                 data_line.append(lat)
                 data_line_append(long)
             elevation = get_elevation(lat,long) #in meters
             elevation *= meters_to_feet #freedom units
             if freedom_units:
                 data_line.append(str(round(elevation*meters_to_feet)))
             else:
                 data_line.append(str(round(elevation)))

             trail_name=''
             if 'name' in d:
                 trail_name = d['name']
                 if not isinstance(trail_name, str):
                     trail_name = ','.join(trail_name) #in case is a list
             data_line.append(trail_name)                     
             
             crossing = set()
             crossing_txt = ''
             for u,neightbor,k,d in Q.edges(u,data=True, keys=True):
                 if 'name' in d:
                     name = d['name']
                     if not isinstance(name, str): #FIX with a foreach
                         name = ','.join(name) #in case is a list
                     if name != trail_name:
                         crossing.add(name)
             if len(crossing)&gt;0:
                 crossing_txt = ','.join(crossing)
             data_line.append(crossing_txt)
                              
             data.append(data_line)                 
        header = []
        if freedom_units:
            header.append('MILE')
        else:
            header.append('KM')
        if lat_long:
            header.append('LAT')
            header.append('LONG')
        header.append('ELEVATION')
        header.append('TRAIL')
        header.append('INTERSECTION')
        table = columnar(data, header, no_borders=True)
        print(table)     
    else:
        print('The graph is not eulerian')

One alternative, that only occurred to me after I had written this, is to output the route as CSV (comma separated values), import into a spreadsheet, and make any changes to presentation and format within the spreadsheet, and eventually export as PDF.

We still do not have water sources, towns, fence or road crossings, etc. I will go off and have a long think on how to include those features automagically.

Download source code here.

Related Posts:

Trails and Graph Theory 9: Gradient Fuzz

In our last episode we added random “noise” to the length values of each segment of our trail graph, and for each iteration of random values we did our normal routine to make an Euler graph, and therefore find a long hiking loop without repeating any trail sections:

  • networkx.algorithms.min_weight_matching() on the fuzzed length odd-nodes graph
  • Delete trail segments between matched nodes to make an all-even node graph
  • Restore 4-nodes that we saved previously
  • Take the largest subgraph, since deleting bits of trails may create disconnected islands
  • If the graph is Eulerian, see if the total length is larger than anything we have tried before (using the actual length values, not the fuzzed lengths).

With our fuzzing routine, we increased our best try at an Euler graph from an initial solution of 154 miles to a respectable 228 miles after 200 or so iterations.

Now, let us alter our approach, doing a technique I am calling “gradient fuzz”. We fuzz our original graph like before, but if the resulting solution is the longest, we save the fuzzed graph as our next input, and fuzz the fuzzed graph instead of fuzzing our original graph.

In other words, if we find a better solution, try to improve that solution instead of continuing to try to improve the original input.

The code needs only minor changes.

def fuzz_min_weight_matching(Q):
    T = nx.Graph()
    for u, v, d in Q.edges(data=True):
        len = d['length']
        len = len * (1 + random.random())
        T.add_edge(u,v,length=len)
    return T, nx.min_weight_matching(T,weight='length')

max_fuzzed_Graph = None
while j&lt;1000:
    if max_fuzzed_Graph == None:
        fuzzed_Graph, matching = 
            fuzz_min_weight_matching(K)
    else:
        fuzzed_Graph, matching = 
           fuzz_min_weight_matching(max_fuzzed_Graph)
    j+=1
    T = K.copy()
    T.remove_edges_from(matching)
    restore_4node_graph(T,fix_4_list)
    nodes = max(nx.connected_components(T), key=len)
    T2 = nx.subgraph(T, nodes).copy()
    if nx.is_eulerian(T2):
        all_matchings.append(matching)
        print('found eulerian on fuzz attempt ', j)
        index = len(all_matchings)-1
        print('length of successful matching: ', len(matching))
        print('matching ', index,  ' is eulerian')
        report(T2,'matching ' + str(index))
        length = graph_length(T2)
        save_png(T2,max_T2,bounds,index)                                               
        if length > max_length:
            max_length = length
            max_T2 = T2.copy()
            max_fuzzed_Graph = fuzzed_Graph.copy()

Results after 1000 iterations:

...

current max length:  270
found eulerian on fuzz attempt  9999
length of successful matching:  68
matching  834  is eulerian
65  nodes  matching 834
66  edges  matching 834
total length 189.19 miles  matching 834
number of odd intersections matching 834 : 0
number of even intersections matching 834 : 65
number of 2-way intersections matching 834 : 64
number of trailhead/terminus matching 834 : 0

current max length:  270
found eulerian on fuzz attempt  1000
length of successful matching:  68
matching  835  is eulerian
65  nodes  matching 835
66  edges  matching 835
total length 161.67 miles  matching 835
number of odd intersections matching 835 : 0
number of even intersections matching 835 : 65
number of 2-way intersections matching 835 : 64
number of trailhead/terminus matching 835 : 0

current max length:  270
1:33:15
finished finding some possible matchings
max length =  270.4589642970273  miles
1:33:16
finished script max_trail3e.py

The color schemes chosen for the output graphs have been tweaked from the last article to be a bit easier to follow. (I am still having trouble finding a good combinations of colors and styles, as the current graph and maximum graph often overlap for much of their length.)

We have improved the maximum known circuit length from 228 to 270 miles, with modest effort.

Download source code here.

Related Posts: