forgottenislanderbot -- command reference


fib-cadb2sql.py

  Usage: python fib-cadb2sql.py (output) (input...)

  where
    output    is the name of the SQLite database file to write to. If it
              already exists, duplicate entries may be created, so this
              should not be used to update existing databases.
    input...  is at least one tab-separated values file from
              http://www.gov.pe.ca/civicaddress/download/index.php3
              with all values. Usually 3 will be used, one for each county.

  Example:
    python fib-cadb2sql.py database.sqlite queens.tsv kings.tsv prince.tsv


fib-crawlbot.py

  Usage: python fib-crawlbot.py (database) (sparsity) (interval) (noCities)

  where
    database  is the SQLite database file generated by fib-cadb2sql.py,
              containing the list of civic addresses. DSL statuses will be
              stored in the same database. Already-checked addresses will
              not be rechecked.
    sparsity  is the minimum difference between civic numbers on a road to
              check. A sparsity of 10 would get #500, skip #509, and get
              #510 and #550. The first address on each road will always be
              gotten. To get every address, use 1.
    interval  is the delay in seconds to wait between requests to the
              server. Can be a float. The bot is single-threaded. Due to
              the database size, a delay of 1 is practical.
    noCities  is whether to ignore the two cities or not, as they likely
              have DSL. Can be 'y' or 'n'. If the whole database is being
              fetched, this may as well be 'n'.

  Example:
    python fib-crawlbot.py database.sqlite 1 1 n


fib-dbutil.py

  Usage: python fib-dbutil.py (database) (COMMAND) (OPTIONS)

  where
    database  is the SQLite database to work on.

  COMMANDS:  OPTIONS:
    --stats  (dslmask) (sparsity) (noCities)
               Prints a table of totals of each DSL status. Useful for
               predicting the results of a given sparsity, and determining
               the final count of no-DSL addresses.
    --list   (dslmask) (sparsity) (noCities)
               Prints out all addresses matching the given sparsity. Not
               very useful, for now.
    --kml    (dslmask) (sparsity) (noCities) (outputKML)
               Generates a Keyhole Markup Language virtual-globe overlay,
               with DSL status represented by differently-coloured icons.
               Sparsity is important here, as Google Earth is not fun to
               use when displaying 68000 placemarks. Default colors are
               EMPTY: white, NODSL: yellow, BASIC: cyan, ULTRA: green.

    dslmask    is which DSL statuses to work with. It should be four y/n
               characters, like 'nynn' -- respectively, EMPTY (unchecked),
               NODSL, BASIC, and ULTRA. For --kml, 'nyyy' or 'nynn' are
               useful. For --stats, use 'yyyy'.
    sparsity   is the minimum difference between civic numbers on a road to
               work with. A sparsity of 10 would get #500, skip #509, and
               get #510 and #550. The first address on each road will
               always be used. To get every address, use 1. This should be
               1 for --stats, but for --kml you'll probably want something
               higher, like 40.
    noCities   is a y/n character specifying whether to exclude the two
               cities. I'd usually use 'n'.
    outputKML  is the filename to write the KML to.

  Examples:
    python fib-dbutil.py database.sqlite --stats yyyy 1 n
    python fib-dbutil.py database.sqlite --list nynn 1000 n
    python fib-dbutil.py database.sqlite --kml nyyy 40 n map.kml

  All commands require all of their "options." The next major version will
  be integrated into one command, fib.py, which will use the optparse module
  to offer actual options, along with more options and an automated mode.


Next Major Version -- so far

fib.py

  Usage: python fib.py (COMMAND) [ OPTIONS ]

  where COMMAND is required, and is ONE of
    -h, --help  show this help message and exit
    --all       do everything -- get civic addresses, run bot, and produce
                KMLs
    --newdb     download civic addresses and make a new, empty database
    --crawl     query the Bell Aliant server, finding DSL statuses
    --kml       generate a KML overlay from a database
    --stats     print the totals of each DSL status
    --list      print every selected address

  However, only --help, --kml, --stats, and --list work now.
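
  The dslmask and sparsity rules described for fib-dbutil.py can be
  sketched roughly as follows. This is only an illustration, not the code
  the scripts use: the names parse_dslmask and thin_by_sparsity are
  hypothetical, and it assumes the difference is measured from the last
  address that was kept on the road, which the reference does not spell out.

```python
# Order of the four mask characters, as given in the reference.
DSL_STATUSES = ("EMPTY", "NODSL", "BASIC", "ULTRA")


def parse_dslmask(mask):
    """Return the set of DSL statuses selected by a four-character y/n mask.

    Hypothetical helper; not part of the fib scripts.
    """
    if len(mask) != len(DSL_STATUSES) or any(c not in ("y", "n") for c in mask):
        raise ValueError("dslmask must be four y/n characters, e.g. 'nynn'")
    return {status for status, flag in zip(DSL_STATUSES, mask) if flag == "y"}


def thin_by_sparsity(civic_numbers, sparsity):
    """Thin the civic numbers of one road by the given sparsity.

    The first (lowest) number is always kept. Each later number is kept
    only if it is at least `sparsity` above the last number kept
    (assumption: the reference does not say which address the difference
    is measured from).
    """
    kept = []
    for number in sorted(civic_numbers):
        if not kept or number - kept[-1] >= sparsity:
            kept.append(number)
    return kept


if __name__ == "__main__":
    print(sorted(parse_dslmask("nyyy")))
    print(thin_by_sparsity([500, 509, 510, 550], 10))
```

  With the reference's own example, a sparsity of 10 applied to #500, #509,
  #510, and #550 keeps #500, #510, and #550, and a sparsity of 1 keeps
  every distinct address.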
  OPTIONS are optional, at last, and can include
    -h, --help          show this help message and exit
    --db=FILE           SQLite database file to work with.
                        Default: database.sqlite
    --dslmask=MASK      DSL types to process; y/n for each of EMPTY, NODSL,
                        BASIC, & ULTRA. Overrides defaults: crawl=ynnn,
                        stats=yyyy, kml=nyyy
    --sparsity=INT      minimum difference in civic numbers along a road to
                        process. Overrides defaults: crawl=1, stats=1,
                        kml=40
    --nocities          don't process addresses in Charlottetown or
                        Summerside
    --max               process more addresses when using a >1 sparsity.
    --always-get=HOUSE  ignoring sparsity, which address on the end of a
                        road to always get. Values: 'none', 'first'
                        (default), 'last', 'both'.
    --sql-where=SQL     custom SQL SELECT conditions. Inserted directly
                        after WHERE. Be careful!
    --coord-rect=latA,longA,latB,longB
                        latitude/longitude coordinates limiting processing
                        to addresses within the rectangle
    --tsv=TSV-FILE      TSV database of civic addresses to add to SQLite
                        database. Can appear multiple times.
    --override-robot-exclusion-standard
                        override a robots.txt, instead of asking
    --quiet-fail        don't ask about a robots.txt -- just exit
    --delay=SECONDS     time to sleep between single-threaded requests to
                        Bell Aliant server. Can be float. Default: 1.0
    --no-connect        Don't reconnect HTTP connection between requests.
                        Experimental, will probably fail.
    --useragent=NAME    user-agent for the bot to represent itself with.
                        Default: forgottenislanderbot
    --pause-every=QUERIES
                        number of requests to run between database-commit /
                        kill-opportunity sleep
    --pause-time=SECONDS
                        time to sleep at an intermittent pause
    --map=KMLFILE       file to write KML to. Will be overwritten.
                        Default: dslmap.kml
    --globe=[GE|NWW]    virtual-globe to optimize KML for
    --civic-addresses-in-kml
                        Label icons in KML by civic address. May raise
                        privacy concerns, file size, lag.
    --no-highlight      Disable highlightable icons in KML. Experimental.
    --kmz               Make a compressed KMZ from the KML.
    --iconsize=FLOAT    Size of KML icons. Default: 1.0
    --icontheme=[NAME|ante*post.png]
                        Icons to use in KML. Either builtin, like 'GM', or
                        external, like 'dsl-*.png', where '*' comprises
                        'ERROR', 'EMPTY', 'NODSL', 'BASIC', and 'ULTRA', in
                        separate files. It can also be a URL.
    --iconhotspot=x,y   Pixel coords in an icon where it is aligned over
                        the lat/long coords.
    --iconcolours=LIST  Colours to tint KML icons, as comma-separated list
                        of five colours like 'bf0000ff', in 'AABBGGRR'
                        format.
    --pipeable          Changes stdout flushing behavior. May cause
                        problems on some systems. Not implemented.
    --verbose           Prints additional information. Not implemented.
    --debug             Prints massive amounts of debug information.
                        Implemented as necessary.

  The only ones that work, of course, are the ones which control --kml,
  --stats, and --list.
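
  As a rough illustration of how per-command defaults such as kml=nyyy and
  kml=40 could be layered on top of optparse, here is a minimal sketch
  covering only --kml, --stats, and a handful of the options above. It is
  not fib.py's actual code: build_parser and parse_args are hypothetical
  names, and storing the chosen command in a `command` attribute is an
  assumption made for the example.

```python
from optparse import OptionParser

# Per-command defaults, copied from the option reference.
DEFAULT_DSLMASK = {"stats": "yyyy", "kml": "nyyy"}
DEFAULT_SPARSITY = {"stats": 1, "kml": 40}


def build_parser():
    """Build a parser for a small subset of the documented options."""
    parser = OptionParser(usage="python fib.py (COMMAND) [ OPTIONS ]")
    # Assumption: each command flag records its name in options.command.
    parser.add_option("--kml", action="store_const", const="kml",
                      dest="command",
                      help="generate a KML overlay from a database")
    parser.add_option("--stats", action="store_const", const="stats",
                      dest="command",
                      help="print the totals of each DSL status")
    parser.add_option("--db", metavar="FILE", default="database.sqlite",
                      help="SQLite database file to work with")
    # None means "not given", so the per-command default can be applied later.
    parser.add_option("--dslmask", metavar="MASK", default=None,
                      help="DSL types to process")
    parser.add_option("--sparsity", metavar="INT", type="int", default=None,
                      help="minimum difference in civic numbers along a road")
    parser.add_option("--nocities", action="store_true", default=False,
                      help="don't process Charlottetown or Summerside")
    parser.add_option("--map", metavar="KMLFILE", default="dslmap.kml",
                      help="file to write KML to")
    return parser


def parse_args(argv):
    """Parse argv (without the program name) and fill per-command defaults."""
    parser = build_parser()
    options, _positional = parser.parse_args(argv)
    if options.command is None:
        parser.error("a COMMAND is required")  # prints usage and exits
    if options.dslmask is None:
        options.dslmask = DEFAULT_DSLMASK[options.command]
    if options.sparsity is None:
        options.sparsity = DEFAULT_SPARSITY[options.command]
    return options


if __name__ == "__main__":
    opts = parse_args(["--kml", "--nocities"])
    print(opts.command, opts.dslmask, opts.sparsity, opts.map)
```

  Leaving --dslmask and --sparsity unset until the command is known is what
  lets one option carry a different default for each command.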