Quick & Dirty arcpy: Compare Feature Class Table Schemas

I’m in the middle of rewriting a workflow, moving most of the processing from arcpy to PostgreSQL-enabled Python (love me some psycopg2).

One of the QC checks I’m doing at the end of this rewrite is verifying that the feature class schemas under the new process are the same as they were under the old process (or that the differences are intended).

And while ArcGIS does have a good tool for this, there were a couple tweaks I wanted to make. Most notably, I wanted a list of fields that are not in both feature classes.

ArcGIS Table Compare

So I made a quick & dirty script to do that. It’s nothing especially clever, but I’ve found it useful. Download it from GitHub. I currently have it set up to work on feature layers, but you should be able to change the toolbox parameter types to allow feature classes or tables.

import arcpy,sys,os

def printit(inMessage):
    print inMessage
    arcpy.AddMessage(inMessage)

featureclass1 = sys.argv[1]
featureclass2 = sys.argv[2]

tableheaders = 'name, type, width, precision, domain'

def makeFieldDict(inFC):
    d = arcpy.Describe(inFC)
    printit("Dataset: "+d.baseName)
    printit("Type: "+d.dataType)
    printit("Path: "+d.catalogPath)
    printit(" ")

    lFields=arcpy.ListFields(inFC)

    printit(tableheaders)
    fieldDict = dict()
    for lf in lFields:
        fieldDict[lf.name] = [lf.name,lf.type,lf.length,lf.precision,lf.domain]
        printit (lf.name+", "+lf.type +", "+str(lf.length)+", "+str(lf.precision)+", "+lf.domain)
    return fieldDict

fieldDict1 = makeFieldDict(featureclass1)
fieldDict2 = makeFieldDict(featureclass2)
errorList = []
printit(" ")
printit(" ")
printit("Comparing Fields:")
for iField in sorted(list(set(fieldDict1.keys()+fieldDict2.keys()))):
    if not (fieldDict1.has_key(iField)):
        theResult = " {0} not found in {1}".format(iField,featureclass1)
        errorList.append(theResult)
    elif not (fieldDict2.has_key(iField)):
        theResult = " {0} not found in {1}".format(iField,featureclass2)
        errorList.append(theResult)
    else:
        if (fieldDict1[iField] == fieldDict2[iField]):
            theResult = " {0} OK".format(iField)
        else:
            theResult = " {0} Have Different Definitions \n   {1}: {2}\n   {3}: {4}".format(iField,featureclass1,fieldDict1[iField],featureclass2,fieldDict2[iField])
            errorList.append(theResult)

    printit( theResult )

printit(" ")
printit(" ")
if len(errorList) == 0:
    printit("GOOD! No difference Found!")
else:
    printit("These Differences Found:")
    for iError in errorList:
        printit(iError)

printit("Done!")
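
The core of the comparison does not need arcpy at all, so it can be sketched on its own. The following is a minimal Python 3 sketch, not the script above: the function name compare_schemas and the two sample schemas are hypothetical, and each definition is assumed to be a list of [type, length, precision, domain].

```python
def compare_schemas(fields1, fields2):
    """Compare two {field_name: definition} dictionaries.

    Returns three sorted lists of field names: those only in the first
    schema, those only in the second, and those present in both but
    with different definitions.
    """
    names1, names2 = set(fields1), set(fields2)
    only_in_first = sorted(names1 - names2)
    only_in_second = sorted(names2 - names1)
    different = sorted(
        name for name in names1 & names2 if fields1[name] != fields2[name]
    )
    return only_in_first, only_in_second, different


# Hypothetical field definitions: [type, length, precision, domain]
old_schema = {
    "OBJECTID": ["OID", 4, 0, ""],
    "strat": ["String", 20, 0, ""],
    "elev": ["Double", 8, 0, ""],
}
new_schema = {
    "OBJECTID": ["OID", 4, 0, ""],
    "strat": ["String", 50, 0, ""],
    "unit": ["String", 20, 0, ""],
}

only_old, only_new, changed = compare_schemas(old_schema, new_schema)
print("Only in old:", only_old)
print("Only in new:", only_new)
print("Different definitions:", changed)
```

Using set differences keeps the "not in both" list separate from the "different definition" list, which is the tweak I wanted over the built-in tool.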

Garmin GPX to Shapefile (SHP) conversion GPX2Shp.py

I mentioned using Tapiriik to batch download my entire Garmin Connect history–over 1,000 separate .GPX files. I found several tools to convert .GPX to shapefiles that worked but none seemed to recognize my heart rate data.

The trick is Garmin extends the GPX specification to incorporate the heart rate:

xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"

Each track point looks like this:

     <trkpt lat="43.68346489146352" lon="-92.99583793468773">
        <ele>296.20001220703125</ele>
        <extensions>
          <gpxtpx:TrackPointExtension>
            <gpxtpx:hr>86</gpxtpx:hr>
          </gpxtpx:TrackPointExtension>
        </extensions>
      </trkpt>
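
One way to reach the heart rate is to hand ElementTree a namespace map and use qualified paths. This is a minimal Python 3 sketch under stated assumptions: the sample document is a hand-built wrapper around the track point shown in this post, the prefix "gpx" and the function name read_track_points are my own choices, and only ele and hr are read.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map; "gpx" is an arbitrary prefix chosen for this sketch.
NS = {
    "gpx": "http://www.topografix.com/GPX/1/1",
    "gpxtpx": "http://www.garmin.com/xmlschemas/TrackPointExtension/v1",
}

# Hand-built sample wrapping a single track point.
SAMPLE_GPX = """<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
     version="1.1" creator="example">
  <trk><trkseg>
    <trkpt lat="43.68346489146352" lon="-92.99583793468773">
      <ele>296.20001220703125</ele>
      <extensions>
        <gpxtpx:TrackPointExtension>
          <gpxtpx:hr>86</gpxtpx:hr>
        </gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>
  </trkseg></trk>
</gpx>"""


def read_track_points(gpx_text):
    """Return one dict per track point with lat, lon, ele and hr."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iterfind(".//gpx:trkpt", NS):
        ele = trkpt.findtext("gpx:ele", namespaces=NS)
        hr = trkpt.findtext(
            "gpx:extensions/gpxtpx:TrackPointExtension/gpxtpx:hr",
            namespaces=NS,
        )
        points.append({
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            "ele": float(ele) if ele is not None else None,
            "hr": int(hr) if hr is not None else None,  # None when no HR strap
        })
    return points


track_points = read_track_points(SAMPLE_GPX)
print(track_points)
```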

 

Since the first few existing GPX converters failed to meet my needs, I decided to make my own, at least partially.

I used the pyshp library by Joel Lawhead of GeospatialPython.com to handle writing the shapefile. I added a basic loop and stuck a template.prj in the workspace that gets copied once for each shapefile.

Otherwise, nothing too fancy going on. The code can be downloaded from GitHub.

import glob, os
import xml.etree.ElementTree as ET
import shapefile
import shutil

theNS = "{http://www.topografix.com/GPX/1/1}".lower()
theNS2 = "{http://www.garmin.com/xmlschemas/TrackPointExtension/v1}".lower()
templatePRJfile = "template.prj"

def elementIs(inElement,inTag):
    item1 = inTag.lower()
    item2 = elementName(inElement)
    return (inTag.lower() == elementName(inElement).lower())

def elementName(inElement):
    item1= inElement.tag.lower().replace(theNS,"").replace(theNS2,"")
    return item1

def convertTimeToSeconds(inTime):
    theSeconds = -1

    if (inTime.count(":")) == 2:
        try:
            inHour = inTime.split(":")[0]
            inMin = inTime.split(":")[1]
            inSec = inTime.split(":")[2]

            totalSec = float(inSec)
            totalSec += (float(inMin) * 60)
            totalSec += (float(inHour) * 3600)
            theSeconds = totalSec
        except:
            pass

    return theSeconds


def writeSHP(inSourceFile,inTrkList):
    w = shapefile.Writer(shapefile.POINT)
    w.field("file")
    w.field("segment","N","8",0)
    w.field("vertex","N","8",0)
    w.field("datetime","C",30)
    w.field("date","C","10",0)
    w.field("time","C","8",0)
    w.field("sec","N","8",0)
    w.field("isec","N","8",0)
    w.field("totsec","N","8",0)
    w.field("elev","N","24",14)
    w.field("hr","N","8",0)
    w.field("last","N","1",0)
    w.field("lat","N","24",16)
    w.field("lon","N","24",16)

    iTrkSegIndex = 0
    startSec =-1
    prevSec = -1
    for iTrkSeg in inTrkList:
        iTrkPtIndex = 0
        for iTrkPtDict in iTrkSeg:
            thisLine = "{0},{1},{2},*time*,*ele*,*hr*,*lat*,*lon*".format(inSourceFile,iTrkSegIndex,iTrkPtIndex)

            theLat = None
            if (iTrkPtDict.has_key('lat')):
                try:
                    theLat = float(iTrkPtDict['lat'])
                except:
                    pass

            theLon = None

            if (iTrkPtDict.has_key('lon')):
                try:
                    theLon = float(iTrkPtDict['lon'])
                except:
                    pass

            theDate = None
            theTime = None
            theSeconds = -1
            segSeconds = -1
            totSeconds = -1

            if (iTrkPtDict.has_key('time')):
                theDateTime = iTrkPtDict['time']
                if ("T" in theDateTime):
                    theDate = theDateTime.split("T")[0]
                    theTimePlue = theDateTime.split("T")[1]
                    if ("+" in theTimePlue):
                        theTime = theTimePlue.split("+")[0]
                        theSeconds = convertTimeToSeconds(theTime)

                        if (prevSec < 0):
                            prevSec = theSeconds
                        if (startSec<0):
                            startSec = theSeconds

                        segSeconds = theSeconds - prevSec
                        prevSec = theSeconds
                        totSeconds = theSeconds - startSec
            else:
                theDateTime = None

            if (iTrkPtDict.has_key('ele')):
                theElev = iTrkPtDict['ele']
            else:
                theElev = None

            if (iTrkPtDict.has_key('hr')):
                theHR = iTrkPtDict['hr']
            else:
                theHR = None

            if (iTrkPtIndex == len(iTrkSeg) - 1):
                theLast = 1
            else:
                theLast = 0

            w.point(theLon, theLat)
            try:
                w.record(inSourceFile.replace(".gpx",""),iTrkSegIndex,iTrkPtIndex,theDateTime,theDate,theTime,theSeconds,segSeconds,totSeconds,theElev,theHR,theLast,theLat,theLon)
            except:
                print "############## ERROR ####################"
            iTrkPtIndex+=1

        iTrkSegIndex+=1


    w.save(inSourceFile.lower().replace(".gpx",""))
    w = None
    if (os.path.exists(templatePRJfile)):
        newPRJFN = inSourceFile.lower().replace(".gpx",".prj")
        shutil.copyfile(templatePRJfile,newPRJFN)

def mainLoop():
    for iFile in glob.glob("*.gpx"):
        print iFile
        tree = ET.parse(iFile)
        root=tree.getroot()

        theTrkList = []

        for iRoot in root:
            if elementIs(iRoot,"trk"): #"http://www.topografix.com/gpx/1/1}trk" in iRoot.tag.lower():
                for iTrkSeg in iRoot:
                    if not elementIs(iTrkSeg,"trkseg"):
                        continue
                    thisTrk = []

                    pntIndex = 0
                    for iTrkPt in iTrkSeg:
                        if not elementIs(iTrkPt,"trkpt"):
                            continue
                        trkPntDict = dict()
                        trkPntDict["pntIndex"] = pntIndex
                        trkPntDict['lat'] = iTrkPt.get('lat')
                        trkPntDict['lon'] = iTrkPt.get('lon')

                        pntIndex+=1
                        for iElem in iTrkPt:
                            if elementIs(iElem,"extensions"):
                                for iSubElem in iElem:
                                    if (elementIs(iSubElem,"TrackPointExtension")):
                                        for iExtensionElem in iSubElem:
                                            if elementIs(iExtensionElem,"hr"):
                                                trkPntDict[elementName(iExtensionElem)] = iExtensionElem.text
                            else:
                                trkPntDict[elementName(iElem)] = iElem.text

                        #print trkPntDict
                        thisTrk.append(trkPntDict)

                    theTrkList.append(thisTrk)
        writeSHP(iFile.lower(), theTrkList)


theLineList = mainLoop()
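
The sec, isec, and totsec fields above are derived from the clock time alone, so a run that crosses midnight or a timestamp ending in "Z" would not be handled. As a hedged alternative (not what the script above does), the elapsed times can be computed from full timestamps with the standard datetime module. This is a Python 3 sketch; the function names and the sample timestamps are made up for illustration.

```python
from datetime import datetime


def parse_gpx_time(text):
    """Parse an ISO 8601 timestamp such as 2014-02-05T23:59:58.000Z."""
    # Older Python 3 releases reject a trailing "Z", so normalise it first.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def elapsed_seconds(timestamps):
    """Return (seconds since previous point, seconds since first point)."""
    times = [parse_gpx_time(t) for t in timestamps]
    results = []
    previous = times[0] if times else None
    for current in times:
        since_previous = (current - previous).total_seconds()
        since_start = (current - times[0]).total_seconds()
        results.append((since_previous, since_start))
        previous = current
    return results


# Hypothetical timestamps that cross midnight
stamps = [
    "2014-02-05T23:59:58.000Z",
    "2014-02-06T00:00:03.000Z",
    "2014-02-06T00:00:04.000Z",
]
elapsed = elapsed_seconds(stamps)
print(elapsed)
```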

 

Zipping a Shapefile from ArcCatalog

Back in 2010, I posted a python script and an ArcToolbox tool for zipping a shapefile.

Well, I had a request to modify the code so it would not error out if it encounters a .lock file. While .lock files exist for a reason and shouldn’t be totally ignored, in some cases it is safe to do so, so I went ahead and modified the code, which can be downloaded from GitHub.

The guts of the code are here, though:

import zipfile
import sys
import os
import glob

theShapeFile = sys.argv[1]
outputZipFile = sys.argv[2]
skipLockFile = sys.argv[3]

def zipShapefile(inShapefile, newZipFN, skipLockFile):
    print 'Starting to Zip '+inShapefile+' to '+newZipFN

    if not (os.path.exists(inShapefile)):
        print inShapefile + ' Does Not Exist'
        return False

    if (os.path.exists(newZipFN)):
        print 'Deleting '+newZipFN
        os.remove(newZipFN)

        if (os.path.exists(newZipFN)):
            print 'Unable to Delete '+newZipFN
            return False

    zipobj = zipfile.ZipFile(newZipFN,'w')

    for infile in glob.glob( inShapefile.lower().replace(".shp",".*")):
        print infile
        if not ((os.path.splitext(infile.lower())[1] == ".lock") and (skipLockFile.lower() == "true")):
            zipobj.write(infile,os.path.basename(infile),zipfile.ZIP_DEFLATED)

    zipobj.close()

    return True

zipShapefile(theShapeFile,outputZipFile,skipLockFile)
print "done!"
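
For anyone on Python 3, the same idea can be sketched with a context manager. This is a rough sketch rather than a port of the tool: zip_shapefile and its parameters are names I picked, the file names in the demo are placeholders, and it collects the file list before creating the archive so a zip that shares the shapefile's base name is not added to itself.

```python
import glob
import os
import tempfile
import zipfile


def zip_shapefile(shp_path, zip_path, skip_lock_files=True):
    """Zip every file sharing the shapefile's base name; return the names."""
    base = os.path.splitext(shp_path)[0]
    zip_abs = os.path.abspath(zip_path)

    # Build the list first so the new archive is never one of the inputs.
    parts = []
    for part in sorted(glob.glob(glob.escape(base) + ".*")):
        if os.path.abspath(part) == zip_abs:
            continue
        if skip_lock_files and part.lower().endswith(".lock"):
            continue
        parts.append(part)

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in parts:
            archive.write(part, os.path.basename(part))
    return [os.path.basename(part) for part in parts]


# Demo on placeholder files in a temporary folder
with tempfile.TemporaryDirectory() as workspace:
    for name in ("roads.shp", "roads.shx", "roads.dbf", "roads.shp.sr.lock"):
        with open(os.path.join(workspace, name), "w") as handle:
            handle.write("placeholder")
    zip_name = os.path.join(workspace, "roads.zip")
    names = zip_shapefile(os.path.join(workspace, "roads.shp"), zip_name)
    with zipfile.ZipFile(zip_name) as archive:
        archived = sorted(archive.namelist())
    print(names)
```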

Friday Fave: Geodatabase Geek

This Friday Fave is more for utility than pleasure.

Unfortunately, I have been working to determine why my views and query layers perform so much worse than directly accessing my feature class.

My Googling led me to Geodatabase Geek, by Trevor Hart, Eagle Technology Group Ltd. Trevor has some really good information about geodatabases and also gave a good lightning talk on Usage Reporting on ArcGIS 10.1 for Server at the 2013 ESRI International Developer’s Conference.

One tool he pointed out was Mxdperfstat for benchmarking the performance of your MXD. Trevor used it to compare the performance of a Feature Class vs Query Layer vs Spatial View. While the official version is available for ArcGIS 9.3 through 10.2, I do want to point out Hussein Nasser’s 10.1 version which he put out before the official 10.1 version came out (it’s not really a version, more of a work-around but I like his ingenuity).

My results were significantly different on our 10.0 database server: the spatial view I was testing was much slower. The query for both the spatial view and query layer was simply “Select * from featureclass”.

So I’m not sure what to make of the performance yet. I’ve got a spatial index built, so I’m not sure what else I can try.

ArcSDE 10.0 Performance

 

Friday Fave: Tapiriik

This Friday Fave is a little bit different.

My interest in geospatial technologies (although we just called it GIS back then) began largely because I wanted to measure my running routes more accurately and efficiently than the paper map & scrap of paper method I was using in the early 90s. When I was introduced to GIS, I knew what I was going to use it for.

Now that GPS technology is ubiquitous–I’m currently using four different GPS devices, at the same time, on my bike rides–I seldom have to use a map to measure my routes. I may still use MapMyRun.com to plan a route ahead of time if I’m running in a new area or trying to plan a loop of a certain distance but GPS has really made it so simple to just go out and run.

While there are several GPS options available, I have used Garmin ever since their 405 Forerunner came out. This was their first watch that didn’t get confused for a Timex-Sinclair 1000 strapped to your wrist.

Timex Sinclair 1000

Garmin’s watches upload their data to Garmin Connect, which works fine, but for a project I recently started, I wanted to download all my data, which Garmin Connect does not make easy. I had over 1,000 data logs and downloading them individually was not going to happen.

A little Googling led me to tapiriik.com, which allows you to share data amongst several different online services that endurance athletes might use including Runkeeper, Strava, Garmin Connect, SportTracks, DropBox.com, and Training Peaks.

https://tapiriik.com/

 

You can use Tapiriik for free (you just need to visit their website to start each synchronization), or for $2 per year they will synchronize your accounts automatically. Either way, you just provide your account information for whichever sites you want to synchronize.

I linked my Garmin Connect and Dropbox accounts to tapiriik and, after some chugging, I had .GPX files for all of my data.

It was easy and didn’t take very long. Definitely met my needs–I haven’t shared data between other services but I do use the desktop version of SportTracks and considered using their web version of the software but didn’t know how I would upload all my data.  Now I know.

This product definitely saved me a bunch of time. The one thing I wonder, though, is what “Tapiriik” means.

 

Quick & Dirty arcpy: Bulk Changing Field Values

In mapping cross sections, our geologists often find themselves renaming their stratigraphic units midway through, or at the end of, creating multiple cross sections. This can cause a situation where we need to change multiple values in multiple fields in multiple feature classes–a situation that can get messy very fast.

Perfect situation for a quick & dirty arcpy script and, in this case, an ArcToolbox tool that can be downloaded.

This tool will change all feature classes in the O:\clay_cga\sand-distribution_model\dnrPackages\stratlines directory.

It will look at two fields, [strat] and [unit] and make these changes:

  • “go” becomes “gro”
  • “goc” becomes “grc”
  • “sgb” becomes “grb”

And since I have Case Sensitive checked, “Go” will not get changed to “gro”. Also note that only full values that match values in the Old Value List get changed. Partial matches are left as-is, so “got” would not change even though its first two characters match “go”.
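
The matching rule can be sketched without arcpy. This is a minimal Python 3 sketch of the lookup only, assuming whole-value matches against a dictionary; build_value_map and remap are hypothetical helper names, not functions from the tool.

```python
def build_value_map(old_values, new_values, case_sensitive=True):
    """Pair each old value with its replacement, rejecting bad input."""
    if len(old_values) != len(new_values):
        raise ValueError("old and new value lists must be the same length")
    value_map = {}
    for old, new in zip(old_values, new_values):
        key = old if case_sensitive else old.lower()
        if key in value_map:
            raise ValueError("value {0!r} is repeated".format(key))
        value_map[key] = new
    return value_map


def remap(value, value_map, case_sensitive=True):
    """Return the replacement for a whole-value match, else the value."""
    key = value if case_sensitive else value.lower()
    return value_map.get(key, value)


old_list = ["go", "goc", "sgb"]
new_list = ["gro", "grc", "grb"]

strict_map = build_value_map(old_list, new_list, case_sensitive=True)
strict = [remap(v, strict_map) for v in ["go", "Go", "got", "sgb"]]
print(strict)

loose_map = build_value_map(old_list, new_list, case_sensitive=False)
loose = remap("Go", loose_map, case_sensitive=False)
print(loose)
```

Because the lookup is on the whole value, “got” never matches “go”, and “Go” only matches when case sensitivity is turned off.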

 

Bulk Field Change

import arcpy
import sys

def printit(inString):
    print inString
    arcpy.AddMessage(inString)

def printerr(inString):
    print inString
    arcpy.AddError(inString)

def fieldExists(tablename,indexname):
    # True if the table exists and has a field with this name (case-insensitive)
    if not arcpy.Exists(tablename):
        return False

    tabledescription = arcpy.Describe(tablename)

    for iField in tabledescription.fields:
        if (iField.name.lower() == indexname.lower()):
            return True

    return False


if len(sys.argv) > 1:
    inDirectory = sys.argv[1]
    inFieldNameRaw = sys.argv[2]
    oldValue = sys.argv[3].replace(","," ")
    newValue = sys.argv[4].replace(","," ")
    caseSensitiveRaw = sys.argv[5]
else:
    inDirectory = r"C:\temp\test\stratest"
    inFieldNameRaw = "strat"
    oldValue = "go, goc, sgb".replace(","," ")
    newValue = "gro grc grb".replace(","," ")
    caseSensitiveRaw = "true"

caseSensitive = (caseSensitiveRaw.lower() == "true")
fieldNameList = inFieldNameRaw.replace(","," ").split()

printit("Starting")
printit(" Workspace: "+str(inDirectory))
printit( " inFieldName: "+str(inFieldNameRaw))
printit( " oldValue: "+str(oldValue))
printit( " newValue: "+str(newValue))
printit( " caseSensitive: "+str(caseSensitive))

valueDict = dict()

def initialQC():
    global valueDict

    if not (arcpy.Exists(inDirectory)):
        printerr("Workspace {0} does not exist".format(inDirectory))
        return False

    if (len(oldValue.split()) <> len(newValue.split())):
        printerr("Number of values in {0} does not equal number of values in {1}".format(oldValue,newValue))
        return False

    iValueIndex = 0
    for iOldValue in oldValue.split():
        if (caseSensitive):
            thisKey = iOldValue
        else:
            thisKey = iOldValue.lower()

        if (valueDict.has_key(thisKey)):
            printerr("ERROR: Value, {0}, is repeated, cancelling...".format(thisKey))
            return False

        valueDict[thisKey] = newValue.split()[iValueIndex]
        iValueIndex+=1
    return True

def makeFieldList(inFC):
    thisFieldList = []

    for iField in fieldNameList:
        if (fieldExists(inFC,iField)):
            thisFieldList.append(iField)

    return thisFieldList


def main():
    arcpy.env.workspace = inDirectory
    printit(valueDict)
    for iFC in arcpy.ListFeatureClasses():
        printit("Working on {0}".format(iFC))

        iFieldList = makeFieldList(iFC)
        if (len(iFieldList) == 0):
            printit(" No fields to change, Skipping...")
            continue

        rows = arcpy.UpdateCursor(iFC)
        row = None  # so the "del row" cleanup below works even if the table is empty

        changes = 0
        printit(" Changing Rows")
        for row in rows:
            iChange = 0
            for iField in iFieldList:
                iValue = str(row.getValue(iField))
                newValue = iValue

                if valueDict.has_key(iValue):
                    newValue = valueDict[iValue]
                else:
                    if not (caseSensitive):
                        if valueDict.has_key(iValue.lower()):
                            newValue = valueDict[iValue.lower()]

                if (newValue <> iValue):
                    printit("CHANGE {0}".format(newValue))
                    row.setValue(iField,newValue)
                    iChange+=1

            if (iChange > 0):
                changes+=1
                rows.updateRow(row)
        printit(" Made {0} changes".format(changes))
        del row
        del rows

    printit("Main")

if (initialQC()==True):
    main()

printit("Done")

Friday Fave: Custom Maps App for Android

I admit, I love picking up freebie maps. Whether it is from the front desk of a hotel or from the bicycle shop, there is a certain appeal to seeing what people put on maps. I have maps of organic orchards, breweries, Minnesota authors, rails to trails, zoos, fictional places, race maps, and a variety of other things that someone felt the need to cartographize.

http://thefriends.org/wp-content/uploads/2012/12/mn_writers_on_the_map_web_download.pdf
http://thefriends.org/

So, with all these paper maps lying around, I was thrilled to find Custom Maps, a free app on Google Play.

This app allows you to take a picture (or use an image on your device) and georeference it.  You can then view your location on the map. I’ve tried it with a few maps and have been happy with the results.

Similar to georeferencing in a desktop application, you select points on the map and then the same points on a control. You need at least two or three points. After doing it a couple of times, the process becomes pretty easy and it just takes a minute or two from taking the picture to viewing your current location on the map.

http://www.custommapsapp.com

An example use case for this app: you are lugging your family around an overwhelmingly large amusement park and are never quite sure where you are. You can take a picture of the map you picked up at the entrance, georeference it, and your phone will then show exactly where you are on that map and where the nearest loo is.

The biggest limitation I’ve seen so far is a 3-5 megapixel limit on the image size. Apparently this is an Android limitation on how much memory an app can consume. But if you adjust your camera settings not to exceed this, you should be good.

So far, I’m enjoying this app. The author, Marko Teittinen (a good Finn name), has made the source code open source so I look forward to digging into it in more detail.

 

Friday Fave: Cartozia Tales

Obviously cartographers belong in the same category as other superheroes like Superman, Batman, and Spiderman, and we finally have our own comic book to prove it.

Cartozia Tales is a collaborative effort of nine indie artists, with two guest artists each issue.

Cartozia

They have an interesting plan: they’ve split the world into nine regions (what’s the name for a ninth of an area, nona-rant?) and each artist will tell a story from a different region each issue. They may build off the region’s story from the previous issue, continue the story they told in the previous issue, or start something fresh.

After being funded by a Kickstarter campaign, they’ve shipped three issues already and have at least seven more promised. I think the project definitely deserves a look.

Friday Fave: Public Workout Maps

When I first found out about GIS, the first application that came to mind was using it to map my running routes. At that time, I was using paper maps and scraps of paper to measure how far I was running each day. GIS obviously offered a better method.

Almost twenty years later, GPS has become so commonplace that I think we have four or five devices in our household that have GPS capabilities, and measuring my runs has become ridiculously simple.

Given that background, you can understand why I love projects like this one by Nikita Barsukov, who compiled publicly available data from Endomondo and created maps of public running workout tracks for a variety of European cities, including this sample from Helsinki.

helsinki
Barsukov.Net

Flowing Data, using a similar concept, generated similar maps for several cities in the United States, including Minneapolis–some areas I know.

http://flowingdata.com/2014/02/05/where-people-run/

I’ve got about six years of running data of my own that I should use to generate a personal running route map. It would be a fun project.

Got Topology?